AITopics | batch normalization

Riemannian neural networks have proven effective in solving a variety of machine learning tasks. The key to their success lies in the development of principled Riemannian analogs of fundamental building blocks in deep neural networks (DNNs). Among those, Riemannian batch normalization (BN) layers have been shown to enhance training stability and improve accuracy. In this paper, we propose BN layers for neural networks on complex domains. The proposed layers have close connections with existing Riemannian BN layers. We derive essential components for practical implementations of BN layers on some complex domains that are less studied in previous works, e.g., the Siegel disk domain. We conduct experiments on radar clutter classification, node classification, and action recognition, demonstrating the efficacy of our method.

artificial intelligence, machine learning, neural network, (15 more...)

arXiv.org Machine Learning

2605.00467

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

48bea99c85bcbaaba618ba10a6f69e44-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 17:45:01 GMT

artificial intelligence, machine learning, subsequence, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

1c10d0c087c14689628124bbc8fa69f6-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 13:46:24 GMT

A.1 For LEHD model. In Table 5, we explore the effects of eliminating normalization from the attention layer in our LEHD model. We train three LEHD models with the same training scheme and training budget, differing solely in the attention layer: one with batch normalization (BN), one with instance normalization (IN), and one without normalization (w/o). [...] We train all three POMO models with the same reinforcement learning method with POMO strategy and training budget (1000 epochs). The results show that different types of normalization have little effect on the POMO model. The results in Table 6 show that removing normalization from the attention layer has little impact on the model with a heavy encoder and a light decoder.
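To make the BN versus IN distinction in the excerpt concrete, here is a minimal sketch in plain Python. It assumes inputs shaped [batch][nodes][channels], which is an assumption about how attention-layer embeddings are laid out, not something the excerpt states. The function names `batch_norm` and `instance_norm` are illustrative, and the learnable scale and shift as well as running statistics are left out.

```python
import math

def _normalize(values, eps=1e-5):
    """Shift to zero mean and scale to (approximately) unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]

def batch_norm(x, eps=1e-5):
    """Per-channel statistics pooled over every instance and node in the batch."""
    B, N, C = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * C for _ in range(N)] for _ in range(B)]
    for c in range(C):
        flat = _normalize([x[b][n][c] for b in range(B) for n in range(N)], eps)
        for b in range(B):
            for n in range(N):
                out[b][n][c] = flat[b * N + n]
    return out

def instance_norm(x, eps=1e-5):
    """Per-channel statistics computed separately inside each instance."""
    B, N, C = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * C for _ in range(N)] for _ in range(B)]
    for b in range(B):
        for c in range(C):
            col = _normalize([x[b][n][c] for n in range(N)], eps)
            for n in range(N):
                out[b][n][c] = col[n]
    return out

# Two instances, two nodes each, one channel; the second instance is 10x larger.
x = [
    [[1.0], [3.0]],
    [[10.0], [30.0]],
]
bn = batch_norm(x)
inn = instance_norm(x)
```

In this toy input, instance normalization maps both instances to nearly the same values because each is rescaled on its own, while batch normalization keeps the scale difference between instances because the statistics are shared across the batch.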

artificial intelligence, machine learning, node, (19 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.35)

37693cfc748049e45d87b8c7d8b9aacd-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 11:37:51 GMT

artificial intelligence, detector, machine learning, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

SPD domain-specific batch normalization to crack interpretable unsupervised domain adaptation in EEG

Neural Information Processing Systems | Apr-25-2026, 04:56:56 GMT

Electroencephalography (EEG) provides access to neuronal dynamics noninvasively with millisecond resolution, rendering it a viable method in neuroscience and healthcare. However, its utility is limited as current EEG technology does not generalize well across domains (i.e., sessions and subjects) without expensive supervised re-calibration. Contemporary methods cast this transfer learning (TL) problem as a multi-source/-target unsupervised domain adaptation (UDA) problem and address it with deep learning or shallow, Riemannian geometry aware alignment methods. Both directions have, so far, failed to consistently close the performance gap to state-of-the-art domain-specific methods based on tangent space mapping (TSM) on the symmetric, positive definite (SPD) manifold. Here, we propose a machine learning framework that enables, for the first time, learning domain-invariant TSM models in an end-to-end fashion. To achieve this, we propose a new building block for geometric deep learning, which we denote SPD domain-specific momentum batch normalization (SPDDSMBN). A SPDDSMBN layer can transform domain-specific SPD inputs into domain-invariant SPD outputs, and can be readily applied to multi-source/-target and online UDA scenarios. In extensive experiments with 6 diverse EEG brain-computer interface (BCI) datasets, we obtain state-of-the-art performance in inter-session and -subject TL with a simple, intrinsically interpretable network architecture, which we denote TSMNet.

artificial intelligence, machine learning, normalization, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.68)
Asia > Japan > Honshū (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

26cd8ecadce0d4efd6cc8a8725cbd1f8-Paper.pdf

Neural Information Processing Systems | Apr-25-2026, 04:21:45 GMT

artificial intelligence, machine learning, representation, (15 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

043ab21fc5a1607b381ac3896176dac6-Supplemental.pdf

Neural Information Processing Systems | Apr-24-2026, 11:09:16 GMT

artificial intelligence, experiment, machine learning, (15 more...)

Neural Information Processing Systems

Country: Europe > France (0.16)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

0266d95023740481d22d437aa8aba0e9-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 05:30:55 GMT

accuracy, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: North America > Canada (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Government (1.00)
Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Security & Privacy (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.67)

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Tim Salimans, Durk P. Kingma

Neural Information Processing Systems | Apr-22-2026, 12:13:32 GMT

By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.
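The excerpt above does not spell out the reparameterization itself. In the weight normalization paper, each weight vector w is written as w = (g / ||v||) v, with a scalar gain g and a direction vector v that are learned in place of w. The sketch below illustrates that formula in plain Python; the helper names `weight_norm` and `neuron_output` are illustrative and do not come from the authors' code.

```python
import math

def weight_norm(v, g):
    """Return w = g * v / ||v|| for a direction vector v and scalar gain g."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction vector v must be non-zero")
    return [g * x / norm for x in v]

def neuron_output(v, g, b, x):
    """Pre-activation w . x + b of one neuron with weight-normalized weights."""
    w = weight_norm(v, g)
    return sum(wi * xi for wi, xi in zip(w, x)) + b

v = [3.0, 4.0]   # direction parameter; only its orientation matters
g = 2.0          # gain parameter; fixes the Euclidean norm of w
w = weight_norm(v, g)

# Rescaling v leaves w unchanged, so the length of w is controlled by g alone.
w_rescaled = weight_norm([30.0, 40.0], g)
```

Because the norm of w always equals g, the length and the direction of the weight vector are decoupled. The computation uses only the parameters of a single neuron and no minibatch statistics, which matches the abstract's point that no dependencies between examples are introduced.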

artificial intelligence, machine learning, normalization, (17 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Swapout: Learning an ensemble of deep architectures

Saurabh Singh, Derek Hoiem, David Forsyth

Neural Information Processing Systems | Apr-22-2026, 01:55:21 GMT

We describe Swapout, a new stochastic training method that outperforms ResNets of identical network structure, yielding impressive results on CIFAR-10 and CIFAR-100.

artificial intelligence, deep learning, machine learning, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
